---
title: Jina AI
description: Learn how to use the Jina AI provider.
---

# Jina AI Provider

[patelvivekdev/jina-ai-provider](https://github.com/patelvivekdev/jina-ai-provider) is a community provider that uses [Jina AI](https://jina.ai) to provide text and multimodal embedding support for the AI SDK.

## Setup

The Jina provider is available in the `jina-ai-provider` module. You can install it with:

<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
  <Tab>
    <Snippet text="pnpm add jina-ai-provider" dark />
  </Tab>
  <Tab>
    <Snippet text="npm install jina-ai-provider" dark />
  </Tab>
  <Tab>
    <Snippet text="yarn add jina-ai-provider" dark />
  </Tab>
  <Tab>
    <Snippet text="bun add jina-ai-provider" dark />
  </Tab>
</Tabs>

## Provider Instance

You can import the default provider instance `jina` from `jina-ai-provider`:

```ts
import { jina } from 'jina-ai-provider';
```

If you need a customized setup, you can import `createJina` from `jina-ai-provider` and create a provider instance with your settings:

```ts
import { createJina } from 'jina-ai-provider';

const customJina = createJina({
  // custom settings
});
```

You can use the following optional settings to customize the Jina provider instance:

- **baseURL** _string_

  The base URL of the Jina API.
  Defaults to `https://api.jina.ai/v1`.

- **apiKey** _string_

  API key that is sent using the `Authorization` header.
  It defaults to the `JINA_API_KEY` environment variable.

- **headers** _Record&lt;string,string&gt;_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise&lt;Response&gt;_

  Custom [fetch](https://developer.mozilla.org/en-US/docs/Web/API/fetch) implementation.
  Defaults to the global `fetch` function.
  You can use it as a middleware to intercept requests,
  or to provide a custom fetch implementation, e.g. for testing.
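Putting these settings together, a customized instance might look like the sketch below. The header name, environment variable, and logging are illustrative, not required by the provider:

```ts
import { createJina } from 'jina-ai-provider';

const customJina = createJina({
  // Read the key from a non-default environment variable.
  apiKey: process.env.MY_JINA_KEY,
  // Send an extra header with every request (name is illustrative).
  headers: {
    'X-Request-Source': 'my-app',
  },
  // Log each outgoing request, then delegate to the global fetch.
  fetch: async (input, init) => {
    console.log('Jina request:', String(input));
    return fetch(input, init);
  },
});
```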

## Text Embedding Models

You can create models that call the Jina text embeddings API using the `.textEmbeddingModel()` factory method.

```ts
import { jina } from 'jina-ai-provider';

const textEmbeddingModel = jina.textEmbeddingModel('jina-embeddings-v3');
```

You can use Jina embedding models to generate embeddings with the `embed` or `embedMany` function:

```ts
import { jina } from 'jina-ai-provider';
import { embedMany } from 'ai';

const textEmbeddingModel = jina.textEmbeddingModel('jina-embeddings-v3');

export const generateEmbeddings = async (
  value: string,
): Promise<Array<{ embedding: number[]; content: string }>> => {
  const chunks = value.split('\n');

  const { embeddings } = await embedMany({
    model: textEmbeddingModel,
    values: chunks,
    providerOptions: {
      jina: {
        inputType: 'retrieval.passage',
      },
    },
  });

  return embeddings.map((embedding, index) => ({
    content: chunks[index]!,
    embedding,
  }));
};
```
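For single inputs, the `embed` function works the same way. The sketch below embeds a search query with the `retrieval.query` input type and ranks previously embedded passages using the AI SDK's `cosineSimilarity` helper; the `rankByQuery` function and its shapes are illustrative:

```ts
import { jina } from 'jina-ai-provider';
import { embed, cosineSimilarity } from 'ai';

const textEmbeddingModel = jina.textEmbeddingModel('jina-embeddings-v3');

export const rankByQuery = async (
  query: string,
  passages: Array<{ content: string; embedding: number[] }>,
) => {
  const { embedding } = await embed({
    model: textEmbeddingModel,
    value: query,
    providerOptions: {
      // Queries use a different input type than stored passages.
      jina: { inputType: 'retrieval.query' },
    },
  });

  // Higher cosine similarity means a more relevant passage.
  return passages
    .map(passage => ({
      ...passage,
      score: cosineSimilarity(embedding, passage.embedding),
    }))
    .sort((a, b) => b.score - a.score);
};
```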

## Multimodal Embedding

You can create models that call the Jina multimodal (text + image) embeddings API using the `.multiModalEmbeddingModel()` factory method.

```ts
import { jina, type MultimodalEmbeddingInput } from 'jina-ai-provider';
import { embedMany } from 'ai';

const multimodalModel = jina.multiModalEmbeddingModel('jina-clip-v2');

export const generateMultimodalEmbeddings = async () => {
  const values: MultimodalEmbeddingInput[] = [
    { text: 'A beautiful sunset over the beach' },
    { image: 'https://i.ibb.co/r5w8hG8/beach2.jpg' },
  ];

  const { embeddings } = await embedMany<MultimodalEmbeddingInput>({
    model: multimodalModel,
    values,
  });

  return embeddings.map((embedding, index) => ({
    content: values[index]!,
    embedding,
  }));
};
```

<Note type="tip">
  Use the `MultimodalEmbeddingInput` type to ensure type safety when using multimodal embeddings.
  You can pass Base64 encoded images to the `image` property in the Data URL format
  `data:[mediatype];base64,<data>`.
</Note>
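To embed a local image, you can convert it to a Data URL first. A minimal helper for this (the `toDataUrl` name is illustrative; it assumes a Node.js runtime):

```ts
import { readFileSync } from 'node:fs';

// Build a Data URL (`data:[mediatype];base64,<data>`) from a local image
// file so it can be passed as the `image` property of a
// MultimodalEmbeddingInput.
export const toDataUrl = (path: string, mediaType = 'image/jpeg'): string => {
  const base64 = readFileSync(path).toString('base64');
  return `data:${mediaType};base64,${base64}`;
};
```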

## Provider Options

Pass Jina embedding options via `providerOptions.jina`. The following options are supported:

- **inputType** _'text-matching' | 'retrieval.query' | 'retrieval.passage' | 'separation' | 'classification'_

  The intended downstream task, which helps the model produce better embeddings. Defaults to `'retrieval.passage'`.

  - `'retrieval.query'`: input is a search query.
  - `'retrieval.passage'`: input is a document/passage.
  - `'text-matching'`: for semantic textual similarity tasks.
  - `'classification'`: for classification tasks.
  - `'separation'`: for clustering tasks.

- **outputDimension** _number_

  Number of dimensions for the output embeddings. See model documentation for valid ranges.

  - `jina-embeddings-v3`: min 32, max 1024.
  - `jina-clip-v2`: min 64, max 1024.
  - `jina-clip-v1`: fixed 768.

- **embeddingType** _'float' | 'binary' | 'ubinary' | 'base64'_

  Data type for the returned embeddings.

- **normalized** _boolean_

  Whether to L2-normalize embeddings. Defaults to `true`.

- **truncate** _boolean_

  Whether to truncate inputs beyond the model context limit instead of erroring. Defaults to `false`.

- **lateChunking** _boolean_

  Whether to automatically split long inputs into 1024-token chunks. Only supported by text embedding models.
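Several options can be combined in a single call. A sketch requesting compact, normalized embeddings for a similarity task (the option values are illustrative; see the per-model ranges above):

```ts
import { jina } from 'jina-ai-provider';
import { embedMany } from 'ai';

const compareTexts = async (values: string[]) => {
  const { embeddings } = await embedMany({
    model: jina.textEmbeddingModel('jina-embeddings-v3'),
    values,
    providerOptions: {
      jina: {
        inputType: 'text-matching', // semantic textual similarity
        outputDimension: 256, // within the 32–1024 range for jina-embeddings-v3
        embeddingType: 'float',
        normalized: true, // L2-normalized output
        truncate: true, // truncate over-length inputs instead of erroring
      },
    },
  });

  return embeddings;
};
```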

## Model Capabilities

| Model                | Context Length (tokens) | Embedding Dimension | Modalities    |
| -------------------- | ----------------------- | ------------------- | ------------- |
| `jina-embeddings-v3` | 8,192                   | 1024                | Text          |
| `jina-clip-v2`       | 8,192                   | 1024                | Text + Images |
| `jina-clip-v1`       | 8,192                   | 768                 | Text + Images |

## Supported Input Formats

### Text Embeddings

- Array of strings, for example: `const strings = ['text1', 'text2']`

### Multimodal Embeddings

- Text objects: `const text = [{ text: 'Your text here' }]`
- Image objects: `const image = [{ image: 'https://example.com/image.jpg' }]` or Base64 data URLs
- Mixed arrays: `const mixed = [{ text: 'object text' }, { image: 'image-url' }, { image: 'data:image/jpeg;base64,...' }]`
